Alignment with Non-overlapping Inversions in O(n^3)-Time
Abstract
Sequence alignments are widely used for biological sequence comparison. Usually only events such as mutations, insertions, and deletions are modeled; other biological events, such as inversions, are not automatically detected by the usual alignment algorithms. Alignment with inversions has no known polynomial-time algorithm. A simplification of the problem that considers only non-overlapping inversions was proposed by Schöniger and Waterman [20] in 1992, together with a corresponding O(n^6) solution. An improvement to an algorithm with O(n^3 log n)-time complexity was announced in an extended abstract [1]. In the present paper, we give an algorithm that solves this simplified problem in O(n^3) time and O(n^2) space, in the more general framework of an edit graph. Inversions have recently [4, 7, 13, 17] been discovered to be very important in comparative genomics, and Scherer et al. [11] in 2005 experimentally verified inversions that were found to be polymorphic in the human genome. Moreover, 10% of the 1,576 putative inversions reported overlap RefSeq genes in the human genome. We believe our new algorithms may open the possibility of more detailed studies of inversions in DNA sequences using exact optimization algorithms, and we hope this may be particularly interesting if applied to regions around known rearrangement boundaries. Scherer et al. report 29 such cases and prioritize them as candidates for biological and evolutionary studies.
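To make the simplified problem concrete, the following is a minimal brute-force sketch of alignment with non-overlapping inversions. It is not the algorithm of the paper: it tries every pair of segments as a candidate inverted block, which gives roughly O(n^6) time. The scoring values (`match`, `mismatch`, `gap`, `inv_penalty`), the function names, and the use of the DNA reverse complement as the inversion operation are all assumptions made for illustration.

```python
# Illustrative sketch only: naive alignment with non-overlapping inversions.
# All names and scoring parameters here are hypothetical, not from the paper.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
NEG_INF = float("-inf")


def reverse_complement(s):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return "".join(COMPLEMENT[c] for c in reversed(s))


def global_score(x, y, match=1, mismatch=-1, gap=-1):
    """Standard global alignment score (no inversions), linear gap cost."""
    prev = [j * gap for j in range(len(y) + 1)]
    for i in range(1, len(x) + 1):
        cur = [i * gap]
        for j in range(1, len(y) + 1):
            sub = match if x[i - 1] == y[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def align_with_inversions(a, b, inv_penalty=-2, match=1, mismatch=-1, gap=-1):
    """Best score when any number of non-overlapping inverted blocks is allowed.

    S[i][j] is the best score for a[:i] versus b[:j]. Besides the usual three
    moves, the last block a[g:i] may be aligned against the reverse complement
    of b[h:j], at an extra cost of inv_penalty. Blocks are chained one after
    another, so they never overlap.
    """
    n, m = len(a), len(b)
    S = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    S[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = NEG_INF
            if i > 0:
                best = max(best, S[i - 1][j] + gap)
            if j > 0:
                best = max(best, S[i][j - 1] + gap)
            if i > 0 and j > 0:
                sub = match if a[i - 1] == b[j - 1] else mismatch
                best = max(best, S[i - 1][j - 1] + sub)
            # Try every non-empty block ending at (i, j) as an inversion.
            for g in range(i):
                for h in range(j):
                    z = global_score(a[g:i], reverse_complement(b[h:j]),
                                     match, mismatch, gap)
                    best = max(best, S[g][h] + inv_penalty + z)
            S[i][j] = best
    return S[n][m]


print(global_score("AACCCAA", "AAGGGAA"))           # 1
print(align_with_inversions("AACCCAA", "AAGGGAA"))  # 5
```

In this toy example the middle segment `GGG` is the reverse complement of `CCC`, so allowing one inversion raises the score from 1 to 5 under the assumed scoring. The sketch is only practical for very short sequences.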
Similar resources
An O(n^4) algorithm for alignment with non-overlapping inversions
Alignment of sequences is widely used for biological sequence comparisons, and only biological events like mutations, insertions and deletions are considered. Other biological events like inversions are not automatically detected by the usual alignment algorithms, thus some alternative approaches have been tried in order to include inversions or other kinds of rearrangements. Despite many import...
A sparse dynamic programming algorithm for alignment with non-overlapping inversions
Alignment of sequences is widely used for biological sequence comparisons, and only biological events like mutations, insertions and deletions are considered. Other biological events like inversions are not automatically detected by the usual alignment algorithms, thus some alternative approaches have been tried in order to include inversions or other kinds of rearrangements. Despite many import...
Alignments with Non-overlapping Moves, Inversions and Tandem Duplications in O(n^4) Time
Sequence alignment is a central problem in bioinformatics. The classical dynamic programming algorithm aligns two sequences by optimizing over possible insertions, deletions and substitutions. However, other evolutionary events can be observed, such as inversions, tandem duplications or moves (transpositions). It has been established that the extension of the problem to move operations is NP-com...
Efficient string-matching allowing for non-overlapping inversions
Inversions are a class of chromosomal mutations, widely regarded as one of the major mechanisms for reorganizing the genome. In this paper we present a new algorithm for the approximate string matching problem allowing for non-overlapping inversions which runs in O(nm) worst-case time and O(m^2) space, for a character sequence of size n and pattern of size m. This improves upon a previous O(nm^2)...
Efficient Matching of Biological Sequences Allowing for Non-overlapping Inversions
Inversions are a class of chromosomal mutations, widely regarded as one of the major mechanisms for reorganizing the genome. In this paper we present a new algorithm for the approximate string matching problem allowing for non-overlapping inversions which runs in O(nm) worst-case time and O(m^2)-space, for a character sequence of size n and pattern of size m. This improves upon a previous O(nm^2)-t...